Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

in_tail: Preventing incorrect inode usage from db. #8025 #8062

Merged
merged 2 commits into from
Apr 9, 2024

Conversation

jinyongchoi
Copy link
Contributor

@jinyongchoi jinyongchoi commented Oct 19, 2023

Preventing incorrect inode usage from db.

The first commit involves deleting items that are not being monitored by Fluent-Bit when it starts.

The second commit involves considering already running Fluent-Bit instances, preventing incorrect inode usage during execution.
This commit is more about enhancement option than fixing a bug. That's why 'true' and 'false' may be needed depending on the user's use case.

Fixes #8025


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change

  • fluent-bit-test.conf


[SERVICE]
    flush        1
    daemon       Off
    log_level    info

    parsers_file parsers.conf
    plugins_file plugins.conf
    storage.metrics on
    storage.path /tmp/storage
    storage.sync normal
    storage.checksum off
    storage.backlog.mem_limit 5M

[INPUT]
    Name tail
    Path /tmp/testing/bulk*
    Exclude_Path *.gz,*.zip
    log_level    info
    Offset_Key   offset

    Read_from_Head true
    Refresh_Interval 1
    Rotate_Wait 1
    Inotify_Watcher false
    
    storage.type filesystem
    storage.pause_on_chunks_overlimit true

    DB /tmp/input.db
    DB.sync normal
    DB.locking false
    # new config
    DB.compare_filename true

[OUTPUT]
    name  stdout
    match *

  • gen.sh
    To account for situations where file events may not be properly detected, the creation and deletion process is repeated without using 'sleep'.

#!/bin/bash

max_iterations=100000

for ((iteration = 1; iteration <= max_iterations; iteration++)); do
    start_range=$((($iteration - 1) * 100 + 1))
    end_range=$(($iteration * 100))

    for i in $(seq $start_range $end_range); do
        filename="bulk$i.log"
        echo "$i." >> "$filename"
        stat -c%i "$filename"
    done

    sleep 0

    for i in $(seq $start_range $end_range); do
        filename="bulk$i.log"
        rm "$filename"
    done
done

  • (1/2) delete stale inode from db
mkdir -p  /tmp/testing/
cd /tmp/testing/
rm -f /tmp/input.db
rm -f /tmp/testing/*.log

# fluentbit start
/opt/fluent-bit/bin/fluent-bit_fixed -c /opt/fluent-bit/etc/fluent-bit/fluent-bit-test.conf

# write
echo "1" > /tmp/testing/bulk.log;stat -c %i /tmp/testing/bulk.log

# fluentbit stop

# logroate and create new one.
rm -f /tmp/testing/bulk.log

# fluentbit start
/opt/fluent-bit/bin/fluent-bit_fixed -c /opt/fluent-bit/etc/fluent-bit/fluent-bit-test.conf

# log
[2023/10/19 05:50:40] [ info] [input:tail:tail.0] db: delete unmonitored stale inodes from the database: count=1

  • (2/2) If the filename differs when retrieving the inode from the database, the inode is deleted.
mkdir -p  /tmp/testing/
cd /tmp/testing/
rm -f /tmp/input.db
rm -f /tmp/testing/*.log

# fluentbit start
/opt/fluent-bit/bin/fluent-bit_fixed -c /opt/fluent-bit/etc/fluent-bit/fluent-bit-test.conf

# run gen script
while true; do sh ./gen.sh; sleep 0; done

# log
# Fluentbit can read or may fail to read, but it must not read incorrectly.
[2023/10/19 06:24:14] [ info] [input:tail:tail.0] db: exists stale file from database: id=19 inode=4491280 offset=5 name=/tmp/testing/bulk126.log file_inode=4491280 file_name=/tmp/testing/bulk925.log
[2023/10/19 06:24:14] [ info] [input:tail:tail.0] db: stale file deleted from database: id=19
[2023/10/19 06:24:14] [ info] [input:tail:tail.0] db: exists stale file from database: id=20 inode=4491305 offset=5 name=/tmp/testing/bulk127.log file_inode=4491305 file_name=/tmp/testing/bulk926.log
[2023/10/19 06:24:14] [ info] [input:tail:tail.0] db: stale file deleted from database: id=20
  • Debug log output from testing the change
(1/2) commit
[2023/10/19 05:13:58] [ info] [input:tail:tail.0] db: delete unmonitored stale inodes from the database: count=1
(2/2) commit
[2023/10/19 05:14:03] [ info] [input:tail:tail.0] db: exists stale file from database: id=1 inode=29632163 offset=12 name=test_db.log file_inode=29632163 file_name=test_db_moved.log
[2023/10/19 05:14:03] [ info] [input:tail:tail.0] db: stale file deleted from database: id=1
  • Attached Valgrind output that shows no leaks or memory corruption was found
valgrind --leak-check=full ./bin/flb-rt-in_tail
...
==120724== HEAP SUMMARY:
==120724==     in use at exit: 0 bytes in 0 blocks
==120724==   total heap usage: 98,919 allocs, 98,919 frees, 32,217,198 bytes allocated
==120724== 
==120724== All heap blocks were freed -- no leaks are possible
==120724== 
==120724== For lists of detected and suppressed errors, rerun with: -s
==120724== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

#1238

Backporting

  • [N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@jinyongchoi jinyongchoi temporarily deployed to pr October 19, 2023 06:37 — with GitHub Actions Inactive
@jinyongchoi jinyongchoi temporarily deployed to pr October 19, 2023 06:37 — with GitHub Actions Inactive
@jinyongchoi jinyongchoi temporarily deployed to pr October 19, 2023 06:37 — with GitHub Actions Inactive
@jinyongchoi jinyongchoi temporarily deployed to pr October 19, 2023 07:06 — with GitHub Actions Inactive
@jinyongchoi jinyongchoi temporarily deployed to pr October 19, 2023 14:19 — with GitHub Actions Inactive
@jinyongchoi jinyongchoi temporarily deployed to pr October 19, 2023 14:19 — with GitHub Actions Inactive
@jinyongchoi jinyongchoi temporarily deployed to pr October 19, 2023 14:19 — with GitHub Actions Inactive
@jinyongchoi jinyongchoi temporarily deployed to pr October 19, 2023 14:47 — with GitHub Actions Inactive
@jinyongchoi
Copy link
Contributor Author

This PR is now ready for review.
Thanks!

Copy link
Contributor

@pwhelan pwhelan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a few observations concerning error checking. The actual functionality seems to be robust and the tests seem to actually prove the patch works as expected.

plugins/in_tail/tail_db.c Outdated Show resolved Hide resolved
plugins/in_tail/tail_db.c Outdated Show resolved Hide resolved
plugins/in_tail/tail_db.c Outdated Show resolved Hide resolved
plugins/in_tail/tail_db.c Outdated Show resolved Hide resolved
plugins/in_tail/tail_db.c Show resolved Hide resolved
plugins/in_tail/tail_db.c Outdated Show resolved Hide resolved
@edsiper
Copy link
Member

edsiper commented Dec 21, 2023

@jinyongchoi thanks for your PR looking forward to merge this, would you please address the comments pointed out by @pwhelan ? thank you

… (1/2)

To prevent incorrect inode references,
FluentBit automatically removes unmanaged inodes during startup.

Signed-off-by: jinyong.choi <[email protected]>
…luent#8025)(2/2)

When checking the existence of a file's inode, if the 'compare_filename'
option is enabled, it is modified to compare the filename as well.
If the inode matches but the filename is different, it removes the stale
inode from the database.

Signed-off-by: jinyong.choi <[email protected]>
@jinyongchoi
Copy link
Contributor Author

@pwhelan Thank you for the review.
@edsiper I have addressed the points mentioned by @pwhelan.
Please review the changes at your convenience and let me know if any further adjustments are needed.
thank you!

@RicardoAAD
Copy link
Collaborator

Tested the new implementation, and it works as expected. When the option DB.compare_filename is enabled, the incorrect offset issue doesn't occur

@pwhelan
Copy link
Contributor

pwhelan commented Jan 2, 2024

@edsiper I have checked and all my comments have been resolved.

Copy link
Contributor

@pwhelan pwhelan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lecaros lecaros added this to the Fluent Bit v3.0.0 milestone Feb 22, 2024
@edsiper edsiper merged commit e4d8d8c into fluent:master Apr 9, 2024
74 checks passed
@edsiper
Copy link
Member

edsiper commented Apr 9, 2024

thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs-required ok-package-test Run PR packaging tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

in_tail: missing log for offset processing due to non-existent old inodes in sqlite.
7 participants